Thrust: A Productivity-Oriented Library for CUDA 26

Author

  • Jared Hoberock
Abstract

This chapter demonstrates how to leverage the Thrust parallel template library to implement high-performance applications with minimal programming effort. Based on the C++ Standard Template Library (STL), Thrust brings a familiar high-level interface to the realm of GPU computing while remaining fully interoperable with the rest of the CUDA software ecosystem. Applications written with Thrust are concise, readable, and efficient.

Similar resources

Parallelization of Weighted Sequence Comparison by Using Ebwt

In this paper, we describe the design of a high-performance weighted sequence comparison algorithm, based on the extended Burrows-Wheeler transform, for many-core GPUs, taking advantage of the full programmability offered by the Compute Unified Device Architecture (CUDA) and its standard library Thrust. Our extended Burrows-Wheeler transform based weighted sequence comparison algorithm with the Thrust library imple...

Hydra: a C++11 framework for data analysis in massively parallel platforms

Hydra is a header-only, templated and C++11-compliant framework designed to perform the typical bottleneck calculations found in common HEP data analyses on massively parallel platforms. The framework is implemented on top of the C++11 Standard Library and a variadic version of the Thrust library and is designed to run on Linux systems, using OpenMP, CUDA and TBB enabled devices. This contribut...

GPU Acceleration for the C++ Standard Template Library

Modern programmers must exploit parallelism for performance gains, possibly through the use of an attached or on-chip GPU. To take advantage of the GPU in C++ programs, the programmer must use either a new language (CUDA or OpenCL) or an external library (Thrust). Rather than requiring that programmers learn new tools, modify existing code, and change software development practices, the C++ Sta...

Algorithmic Improvements for Portable Event-Based Monte Carlo Transport Using the Nvidia Thrust Library

High performance computing environments are progressively moving towards many-core computing architectures. The Los Alamos National Laboratory Trinity machine, available in late 2016, will use both Intel Xeon Haswell processors and Intel Xeon Phi Knights Landing many integrated core (MIC) coprocessors. The Lawrence Livermore National Laboratory Sierra machine, available in 2018, will use an IBM...

Near Real-time Pointcloud Smoothing for 3D Imaging Systems

In this project, a GPU-based implementation of Moving Least Squares is presented for smoothing 3D pointclouds. We used an Xbox Kinect to generate spatial data and coded our algorithm using CUDA with the Thrust library. Our implementation uses an organized set of points and can be computed at ~7 Hz on an Nvidia Quadro FX 4800. While perhaps not directly comparable, it has a speedup of between 30-6...

Publication date: 2011